
[opt] adapt cache blend for store and sparse's new version#664

Merged
mag1c-h merged 4 commits into ModelEngine-Group:develop from wuhuxiao:dev_gsa_device_pr_whx
Jan 23, 2026
Conversation


wuhuxiao (Contributor) commented on Jan 22, 2026

Purpose

What this PR does / why we need it?

Fixes the cache blend functionality and adapts it to the new versions of the store and sparse components.

Modifications

Does this PR introduce any user-facing change?

Adds NVTX profiling ranges for the blend path.
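The NVTX annotation mentioned above can be sketched as follows. This is a minimal illustration, not the PR's actual code: it assumes PyTorch's `torch.cuda.nvtx` bindings, and the range name `cache_blend`, the `nvtx_range` helper, and the no-op fallback stub are all hypothetical.

```python
from contextlib import contextmanager

try:
    from torch.cuda import nvtx  # NVTX bindings shipped with PyTorch
except ImportError:
    # Hypothetical no-op stand-in so the sketch runs without torch installed.
    class nvtx:
        @staticmethod
        def range_push(name):
            pass

        @staticmethod
        def range_pop():
            pass


@contextmanager
def nvtx_range(name):
    """Mark a region so it appears as a named range in Nsight Systems."""
    nvtx.range_push(name)
    try:
        yield
    finally:
        nvtx.range_pop()


# Hypothetical usage around a cache-blend step:
with nvtx_range("cache_blend"):
    blended = [a + b for a, b in zip([1, 2], [3, 4])]
```

Wrapping the region in a context manager guarantees `range_pop` runs even if the blend step raises, keeping the profiler's push/pop stack balanced.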

Test

How was this patch tested?

```shell
DATA_DIR=/home/data/kv_cache
MODEL_PATH=/home/models/mistralai/Mistral-7B-Instruct-v0.2
BLEND_DATASET_PATH=/home/datasets/LongBench/data/2wikimqa.jsonl
cd unified-cache-management
python examples/offline_inference_blend.py
```

wuhuxiao force-pushed the dev_gsa_device_pr_whx branch from 3b067b6 to 5e7110c on January 23, 2026 08:27
mag1c-h merged commit 6000d75 into ModelEngine-Group:develop on Jan 23, 2026
6 checks passed


3 participants